FedBench: A Benchmark Suite for Federated Semantic Data Query Processing

نویسندگان

  • Michael Schmidt
  • Olaf Görlitz
  • Peter Haase
  • Günter Ladwig
  • Andreas Schwarte
  • Thanh Tran
چکیده

In this paper we present FedBench, a comprehensive benchmark suite for testing and analyzing the performance of federated query processing strategies on semantic data. The major challenge lies in the heterogeneity of semantic data use cases, where applications may face different settings at both the data and query level, such as varying data access interfaces, incomplete knowledge about data sources, availability of different statistics, and varying degrees of query expressiveness. Accounting for this heterogeneity, we present a highly flexible benchmark suite, which can be customized to accommodate a variety of use cases and compare competing approaches. We discuss design decisions, highlight the flexibility in customization, and elaborate on the choice of data and query sets. The practicability of our benchmark is demonstrated by a rigorous evaluation of various application scenarios, where we indicate both the benefits as well as limitations of the state-of-the-art federated query processing strategies for semantic data.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Federated SPARQL Query Processing Via CostFed

Efficient source selection and optimized query plan generation belong to the most important optimization steps in federated query processing. This paper presents a demo of CostFed, an index-assisted federation engine for federated SPARQL query processing. CostFed’s source selection and query planning is based on the index generated from the SPARQL endpoints. The key innovation behind CostFed is...

متن کامل

Developing a Benchmark Suite for Semantic Web Data from Existing Workflows

This paper presents work in progress towards developing a new benchmark for federated query processing systems. Unlike other popular benchmarks, our queryset is not driven by technical evaluation, but is derived from workflows established by the pharmacology community. The value of this queryset is that it is realistic but at the same time it comprises complex queries that test all features of ...

متن کامل

The Odyssey Approach for Optimizing Federated SPARQL Queries

Answering queries over a federation of SPARQL endpoints requires combining data from more than one data source. Optimizing queries in such scenarios is particularly challenging not only because of (i) the large variety of possible query execution plans that correctly answer the query but also because (ii) there is only limited access to statistics about schema and instance data of remote source...

متن کامل

A Heuristic-Based Approach for Planning Federated SPARQL Queries

A large number of SPARQL endpoints are available to access the Linked Open Data cloud, but query capabilities still remain very limited. Thus, to support efficient semantic data management of federations of endpoints, existing SPARQL query engines require to be equipped with new functionalities. First, queries need to be decomposed into sub-queries not only answered by the available endpoints, ...

متن کامل

HiBISCuS: Hypergraph-Based Source Selection for SPARQL Endpoint Federation

Efficient federated query processing is of significant importance to tame the large amount of data available on the Web of Data. Previous works have focused on generating optimized query execution plans for fast result retrieval. However, devising source selection approaches beyond triple pattern-wise source selection has not received much attention. This work presents HiBISCuS, a novel hypergr...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2011